Fix SDPO teacher prompt alignment + dashboard visibility #35
Conversation
The teacher prompt was missing the `<|im_start|>assistant\n` prefix before response tokens after the refactor to pre-tokenized data flow. This caused strongly negative advantages for early response tokens, regressing concise preference compliance.

- Set `add_generation_prompt=True` in both Tinker and local/Modal teacher prompt construction to match student prompt alignment
- Decode and emit `teacher_scored_texts` in distill metadata for all three backends (Tinker, local, Modal)
- Render teacher scored text per-sample in the feedback dashboard
- Add tests for prompt alignment, metadata field, and dashboard rendering

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
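The alignment bug can be illustrated with a minimal ChatML-style render in plain Python. This is a simplified stand-in for the model's actual chat template, not the real tokenizer code:

```python
# Minimal sketch (plain strings, no tokenizer) of why a missing
# "<|im_start|>assistant\n" prefix misaligns teacher scoring.
# The ChatML-style template below is illustrative only.

def render_chatml(messages: list[dict[str, str]], add_generation_prompt: bool) -> str:
    """Render messages in a ChatML-like format."""
    out = ""
    for m in messages:
        out += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    if add_generation_prompt:
        # The assistant-turn prefix the student was conditioned on when sampling.
        out += "<|im_start|>assistant\n"
    return out

msgs = [{"role": "user", "content": "Hi"}]
without = render_chatml(msgs, add_generation_prompt=False)
with_prefix = render_chatml(msgs, add_generation_prompt=True)

# Without the prefix, response tokens would be appended directly after
# "<|im_end|>\n", so the teacher scores them in a different context than
# the one the student generated them in — hence the skewed advantages on
# the first few response tokens.
```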
📝 Walkthrough

The changes extend the feedback system to propagate and display per-sample "Teacher Scored Text" throughout the pipeline. Teacher-scored text is computed during distillation, passed through batch processing in the engine, and rendered in the dashboard's expandable detail sections for each sample.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Distillation as Distillation Module
    participant Engine as Engine Module
    participant Rendering as Dashboard Rendering
    Distillation->>Distillation: _build_self_teacher_topk(prompt, feedback, response_ids)
    Distillation->>Distillation: Decode teacher inputIds + responseIds
    Distillation->>Distillation: Compute logprobs, indices, teacher_scored_text
    Distillation->>Distillation: Return (logprobs, indices, teacher_scored_text)
    Engine->>Engine: _build_sample_datum(sample, tokenizer, ...)
    Engine->>Distillation: Call _build_self_teacher_topk()
    Distillation-->>Engine: Return (logprobs, indices, teacher_scored_text)
    Engine->>Engine: Collect teacher_scored_texts per sample
    Engine->>Engine: Create metrics dict with teacher_scored_text
    Engine->>Engine: Build DistillResponse with teacher_scored_texts in metadata
    Engine->>Rendering: Provide metrics_payload with teacher_scored_texts
    Rendering->>Rendering: For each sample, read teacher_scored_text from metrics
    Rendering->>Rendering: Guard against missing text (default to empty list)
    Rendering->>Rendering: Build teacher_section HTML if text exists
    Rendering->>Rendering: Inject teacher_section into sample detail panel
    Rendering-->>Rendering: Render dashboard with Teacher Scored Text
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 3 passed
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
claas/training/distillation.py (1)
171-181: ⚠️ Potential issue | 🟡 Minor

Docstring still says "Pair" after return type extended to a triple.

The `Returns` section should document the new third element `teacher_scored_text`.

📝 Proposed fix

```diff
 Returns:
-    Pair of top-k logprobs and indices for each response token.
+    Triple of (top-k logprobs, top-k indices, teacher_scored_text) for each response token.
+    teacher_scored_text is the decoded full teacher input (prompt + response) with special tokens retained.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@claas/training/distillation.py` around lines 171 - 181, Update the docstring for the function that "Build top-k teacher logits from the frozen base model" (the function with the current Returns documenting a pair) to reflect that it now returns a triple: (top-k logprobs, top-k indices, teacher_scored_text); specifically, change the Returns section to list all three elements and briefly describe each item (top_k logprobs, top_k indices per response token, and teacher_scored_text containing the teacher's scored/annotated text for the response) so the documentation matches the updated return value.
🧹 Nitpick comments (2)
claas/training/engine/tinker/engine.py (1)
351-360: Type annotation `dict[str, float | str]` omits `int` for `completion_len`.

`completion_len` is assigned an `int` (line 280), which falls outside `float | str`. Consider widening to `dict[str, float | int | str]` for precision.

🔧 Proposed fix

```diff
-    metrics: dict[str, float | str] = {
+    metrics: dict[str, float | int | str] = {
```

And correspondingly update the function return type annotation at line 269:

```diff
-) -> tuple[T.Datum, dict[str, float | str]]:
+) -> tuple[T.Datum, dict[str, float | int | str]]:
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@claas/training/engine/tinker/engine.py` around lines 351 - 360, The metrics dict's type annotation only allows float | str but completion_len is an int; change the annotation for the "metrics" variable from dict[str, float | str] to dict[str, float | int | str] and also update the enclosing function's return type annotation (the function that defines metrics and currently declares its return type at the top of the function) to include int alongside float and str so the returned metrics type matches completion_len's actual type.

claas/training/distillation.py (1)
199-215: GPU→CPU sync via `.tolist()` occurs before the forward pass.

Line 200 calls `teacher_full_ids[0].tolist()`, which forces a CUDA device sync at that point. The forward pass (`self.base_model(input_ids=teacher_full_ids)`) at line 205 still needs the GPU tensor and can't start until the sync completes. Moving the decode to after the `with torch.no_grad()` block but before `del teacher_full_ids` allows the GPU to launch the forward pass without waiting for the CPU roundtrip.

⚡ Proposed reordering

```diff
 teacher_full_ids = torch.cat([teacher_prompt_ids, response_ids], dim=-1)
-teacher_scored_text = self.tokenizer.decode(teacher_full_ids[0].tolist(), skip_special_tokens=False)
 teacher_resp_start = teacher_prompt_ids.shape[-1]
 response_token_count = response_ids.shape[-1]
 with torch.no_grad():
     teacher_output = self.base_model(input_ids=teacher_full_ids)
 teacher_logits = teacher_output.logits[:, teacher_resp_start - 1 : -1, :]
 log_probs = self.functional.log_softmax(teacher_logits, dim=-1)
 vocab_size = log_probs.shape[-1]
 k = min(max(1, top_k), vocab_size)
 top_logprobs, top_indices = torch.topk(log_probs[0, :response_token_count], k=k, dim=-1)
 del teacher_output, teacher_logits, log_probs
+teacher_scored_text = self.tokenizer.decode(teacher_full_ids[0].tolist(), skip_special_tokens=False)
 del teacher_full_ids, teacher_prompt_ids
 torch.cuda.empty_cache()
 return top_logprobs, top_indices, teacher_scored_text
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@claas/training/distillation.py` around lines 199 - 215, The current call to tokenizer.decode(teacher_full_ids[0].tolist()) forces a CUDA->CPU sync before running the forward pass; move the decode of teacher_full_ids into the no_grad block after running self.base_model(...) (i.e., perform teacher_output = self.base_model(input_ids=teacher_full_ids) and compute teacher_logits/log_probs/topk first, then call teacher_scored_text = self.tokenizer.decode(teacher_full_ids[0].tolist(), skip_special_tokens=False) before deleting teacher_full_ids) so the forward can launch on GPU without waiting for the tolist() CPU roundtrip.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@claas/training/distillation.py`:
- Around line 171-181: Update the docstring for the function that "Build top-k
teacher logits from the frozen base model" (the function with the current
Returns documenting a pair) to reflect that it now returns a triple: (top-k
logprobs, top-k indices, teacher_scored_text); specifically, change the Returns
section to list all three elements and briefly describe each item (top_k
logprobs, top_k indices per response token, and teacher_scored_text containing
the teacher's scored/annotated text for the response) so the documentation
matches the updated return value.
---
Nitpick comments:
In `@claas/training/distillation.py`:
- Around line 199-215: The current call to
tokenizer.decode(teacher_full_ids[0].tolist()) forces a CUDA->CPU sync before
running the forward pass; move the decode of teacher_full_ids into the no_grad
block after running self.base_model(...) (i.e., perform teacher_output =
self.base_model(input_ids=teacher_full_ids) and compute
teacher_logits/log_probs/topk first, then call teacher_scored_text =
self.tokenizer.decode(teacher_full_ids[0].tolist(), skip_special_tokens=False)
before deleting teacher_full_ids) so the forward can launch on GPU without
waiting for the tolist() CPU roundtrip.
In `@claas/training/engine/tinker/engine.py`:
- Around line 351-360: The metrics dict's type annotation only allows float |
str but completion_len is an int; change the annotation for the "metrics"
variable from dict[str, float | str] to dict[str, float | int | str] and also
update the enclosing function's return type annotation (the function that
defines metrics and currently declares its return type at the top of the
function) to include int alongside float and str so the returned metrics type
matches completion_len's actual type.
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (5)
- claas/dashboard/rendering.py
- claas/training/distillation.py
- claas/training/engine/tinker/engine.py
- tests/test_api.py
- tests/test_tinker_engine.py
Match the hindsight context block from the online SDPO trainer in lasgroup/user_interactions to use the paper's exact framing: "=== HINDSIGHT CONTEXT ===" with "future user message" phrasing instead of the previous "feedback from your earlier attempt" wording. Add paper and repo references to README and source docstrings. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The teacher was always scoring under the generic DEFAULT_SYSTEM_PROMPT
("You are a helpful assistant.") regardless of what system prompt the
student saw. This corrupted the SDPO advantage signal when requests
routed through OpenClaw with a rich agent system prompt.
- Store system_prompt in CompletionCacheEntry from chat completion requests
- Make system_prompt a required field on DistillBatchItem (no default)
- Make system_prompt a required kwarg on build_teacher_messages, removing
the DEFAULT_SYSTEM_PROMPT fallback
- Validate system_prompt presence in /v1/feedback (422 if missing)
- Thread system_prompt through distillation.py and tinker/engine.py
- Update all tests and JSON fixtures
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@codex review
claas/inference/cache.py (Outdated)

```python
    response_token_ids: list[int],
    prompt_token_ids: list[int],
    response_logprobs: list[float] | None,
    system_prompt: str | None = None,
```
this should not be optional
Fixed in a7e23bc — system_prompt is now str (required, no default) in CompletionCacheEntry. The API defaults to "" when no system messages are present, and the feedback endpoint rejects empty system prompts with a 422.
Comment generated by Claude Code
🧹 Nitpick comments (5)
claas/training/engine/tinker/engine.py (2)
292: Long line (>120 chars) — consider wrapping.

```diff
-    teacher_messages = build_teacher_messages(teacher_prompt_source, sample.feedback, system_prompt=sample.system_prompt)
+    teacher_messages = build_teacher_messages(
+        teacher_prompt_source, sample.feedback, system_prompt=sample.system_prompt
+    )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@claas/training/engine/tinker/engine.py` at line 292, The call to build_teacher_messages is longer than 120 chars; rewrite it to wrap arguments across multiple lines so it stays under the line-length limit. Locate the call to build_teacher_messages (using teacher_prompt_source, sample.feedback, system_prompt=sample.system_prompt) and format it like a multi-line function call with each argument on its own indented line, preserving the same argument order and names.
237-255: Widening the metrics dict to `float | str` weakens type safety on aggregation helpers.

The `avg` lambda (line 238) and `sum(m["completion_len"] ...)` (line 237) operate on `dict[str, float | str]` values, so the type checker can't guarantee they receive numerics. This won't break at runtime (only numeric keys are passed), but it may produce type-checker warnings.

A clean alternative: keep the metrics dict as `dict[str, float]` and return `teacher_scored_text` as a separate value in the tuple.

♻️ Suggested refactor

```diff
 async def _build_sample_datum(
     sample: DistillBatchItem,
     tokenizer: Any,
     teacher_sampling: Any,
     kl_coef: float,
-) -> tuple[T.Datum, dict[str, float | str]]:
+) -> tuple[T.Datum, dict[str, float], str]:
```

Then unpack in `distill()`:

```diff
-    datums = [r[0] for r in results]
-    sample_metrics = [r[1] for r in results]
+    datums = [r[0] for r in results]
+    sample_metrics = [r[1] for r in results]
+    teacher_scored_texts = [r[2] for r in results]
```

And use `teacher_scored_texts` directly in the metadata instead of extracting from `sample_metrics`.

Also applies to: 351-359

🤖 Prompt for AI Agents
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@claas/training/engine/tinker/engine.py` around lines 237 - 255, The metrics dict is currently typed to allow strings which weakens type checking for numeric aggregations (used by the avg lambda and sum over sample_metrics); change the producer that builds sample_metrics (the function that returns sample_metrics inside distill()/wherever it is created) to return a tuple of (list[dict[str, float]] metrics, list[str] teacher_scored_texts) so sample_metrics can be typed dict[str, float] and aggregations are type-safe, then unpack that tuple where used (in distill/around the use of sample_metrics and the metadata block) and set metadata["teacher_scored_texts"] = teacher_scored_texts; apply the same refactor for the similar block around lines 351-359 to keep numeric metrics strictly float and teacher text separate.

claas/api.py (1)
235-236: Multiple system messages are silently concatenated.

If a request contains more than one `system` message, they're joined with `\n`. This is a reasonable default, but it may silently alter the system prompt the teacher sees compared to what the model actually received (some backends treat multiple system messages differently). Consider whether a single-system-message constraint or a log warning would be appropriate.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@claas/api.py` around lines 235 - 236, The code currently concatenates multiple system messages into system_prompt by building system_parts from messages and joining them; change this to detect when len(system_parts) > 1 and handle it explicitly: either enforce a single system message by raising a clear error (ValueError) or emit a visible warning via the module logger (e.g., logger.warning) before keeping the existing join behavior; update the logic around system_parts/system_prompt to perform this check and provide a clear message referencing messages, system_parts and system_prompt so callers see when multiple system messages are present.

claas/training/distillation.py (1)
202: `teacher_scored_text` can be large — consider truncation for log storage.

The full decoded teacher prompt + response is stored per-sample in the metadata and ultimately written to feedback logs. For long conversations this could substantially inflate log sizes. A truncated preview (e.g., first/last N characters) might be more practical for the dashboard, with the full text available on demand.
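A hypothetical helper along these lines (the name and exact behavior are illustrative, not part of the codebase):

```python
def truncate_preview(text: str, max_chars: int = 200) -> str:
    """Return text unchanged if short; otherwise keep the head and tail
    around an ellipsis so the preview stays within max_chars."""
    if len(text) <= max_chars:
        return text
    half = (max_chars - 3) // 2  # reserve 3 chars for the ellipsis
    return text[:half] + "..." + text[-half:]
```

The full decoded text could then be kept out of the per-sample log record and re-decoded on demand when a dashboard user expands a sample.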
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@claas/training/distillation.py` at line 202, Replace the direct full decode assigned to teacher_scored_text (the line using self.tokenizer.decode(teacher_full_ids[0].tolist(), ...)) with logic that decodes the full text into a separate variable (e.g., teacher_full_text = self.tokenizer.decode(...)) and then creates a truncated preview (e.g., teacher_scored_text = truncate_preview(teacher_full_text, max_chars=200) which keeps the first/last N chars with an ellipsis). Keep teacher_full_text available if needed elsewhere (store as teacher_full_text or only persist it on demand) and use teacher_scored_text (the preview) for metadata/log writing to avoid bloating logs.

tests/test_teacher_helpers.py (1)
33-40: `test_system_prompt_matches_input` largely duplicates `test_custom_system_prompt`.

Both tests verify that `msgs[0]["content"]` equals the provided `system_prompt` — they differ only in the string value used. Consider consolidating into a single parametrized test if you want to keep both cases.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_teacher_helpers.py` around lines 33 - 40, The two tests test_custom_system_prompt and test_system_prompt_matches_input duplicate coverage of build_teacher_messages for system_prompt; consolidate them into a single parametrized test or remove one. Update the tests to call build_teacher_messages("x", system_prompt=...) with multiple inputs (e.g., "Be concise." and the pirate string) using pytest.mark.parametrize or keep only the more descriptive test_system_prompt_matches_input, ensuring the assertion still checks msgs[0]["content"] equals the provided system_prompt.
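The consolidation could look like the sketch below. `build_teacher_messages` is stubbed here so the example is self-contained; the real helper lives in claas/training/teacher_helpers.py and takes more arguments:

```python
import pytest


def build_teacher_messages(prompt: str, *, system_prompt: str) -> list[dict[str, str]]:
    """Simplified stand-in for the project's helper."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": prompt},
    ]


@pytest.mark.parametrize(
    "system_prompt",
    ["Be concise.", "Ye be a pirate assistant, arr."],
)
def test_system_prompt_matches_input(system_prompt: str) -> None:
    msgs = build_teacher_messages("x", system_prompt=system_prompt)
    assert msgs[0]["content"] == system_prompt
```

One parametrized test keeps both string values covered while removing the duplicated assertion body.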
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@claas/api.py`:
- Around line 235-236: The code currently concatenates multiple system messages
into system_prompt by building system_parts from messages and joining them;
change this to detect when len(system_parts) > 1 and handle it explicitly:
either enforce a single system message by raising a clear error (ValueError) or
emit a visible warning via the module logger (e.g., logger.warning) before
keeping the existing join behavior; update the logic around
system_parts/system_prompt to perform this check and provide a clear message
referencing messages, system_parts and system_prompt so callers see when
multiple system messages are present.
In `@claas/training/distillation.py`:
- Line 202: Replace the direct full decode assigned to teacher_scored_text (the
line using self.tokenizer.decode(teacher_full_ids[0].tolist(), ...)) with logic
that decodes the full text into a separate variable (e.g., teacher_full_text =
self.tokenizer.decode(...)) and then creates a truncated preview (e.g.,
teacher_scored_text = truncate_preview(teacher_full_text, max_chars=200) which
keeps the first/last N chars with an ellipsis). Keep teacher_full_text available
if needed elsewhere (store as teacher_full_text or only persist it on demand)
and use teacher_scored_text (the preview) for metadata/log writing to avoid
bloating logs.
In `@claas/training/engine/tinker/engine.py`:
- Line 292: The call to build_teacher_messages is longer than 120 chars; rewrite
it to wrap arguments across multiple lines so it stays under the line-length
limit. Locate the call to build_teacher_messages (using teacher_prompt_source,
sample.feedback, system_prompt=sample.system_prompt) and format it like a
multi-line function call with each argument on its own indented line, preserving
the same argument order and names.
- Around line 237-255: The metrics dict is currently typed to allow strings
which weakens type checking for numeric aggregations (used by the avg lambda and
sum over sample_metrics); change the producer that builds sample_metrics (the
function that returns sample_metrics inside distill()/wherever it is created) to
return a tuple of (list[dict[str, float]] metrics, list[str]
teacher_scored_texts) so sample_metrics can be typed dict[str, float] and
aggregations are type-safe, then unpack that tuple where used (in distill/around
the use of sample_metrics and the metadata block) and set
metadata["teacher_scored_texts"] = teacher_scored_texts; apply the same refactor
for the similar block around lines 351-359 to keep numeric metrics strictly
float and teacher text separate.
In `@tests/test_teacher_helpers.py`:
- Around line 33-40: The two tests test_custom_system_prompt and
test_system_prompt_matches_input duplicate coverage of build_teacher_messages
for system_prompt; consolidate them into a single parametrized test or remove
one. Update the tests to call build_teacher_messages("x", system_prompt=...)
with multiple inputs (e.g., "Be concise." and the pirate string) using
pytest.mark.parametrize or keep only the more descriptive
test_system_prompt_matches_input, ensuring the assertion still checks
msgs[0]["content"] equals the provided system_prompt.
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (13)
- README.md
- claas/api.py
- claas/core/types.py
- claas/inference/cache.py
- claas/training/distillation.py
- claas/training/engine/tinker/engine.py
- claas/training/teacher_helpers.py
- tests/integration/test_local_engine_integration.py
- tests/integration/test_tinker_stack_integration.py
- tests/test_api.py
- tests/test_local_training_engine.py
- tests/test_teacher_helpers.py
- tests/test_tinker_engine.py
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 91af7c491a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
- Make system_prompt required (str, not optional) in CompletionCacheEntry
- Fix docstring: "Pair" → "Triple" for _build_self_teacher_topk return
- Move tokenizer.decode after forward pass to avoid CUDA→CPU sync stall
- Wrap long build_teacher_messages call in tinker engine
- Consolidate duplicate system_prompt tests into parametrized test

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a7e23bc106
```python
        "not a nested chat template."
    ),
)
system_prompt: str = Field(
```
Keep DistillBatchItem compatible with older feedback logs
Making system_prompt required causes previously written feedback log files (which do not have this field in batch_samples) to fail FeedbackLogRecord.model_validate(...); read_recent_feedback_logs then silently skips those files, so /v1/dashboard can lose historical records immediately after deploy. Adding a default/optional value here (or an explicit migration path) would preserve dashboard visibility for pre-existing logs.
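One way to implement the compatibility suggestion, as a hedged sketch with deliberately simplified models (the real DistillBatchItem carries more fields and validation):

```python
from pydantic import BaseModel

# Simplified stand-in for the real model: giving system_prompt a default
# lets model_validate() accept pre-migration log records whose batch_samples
# lack the field, while new write paths always populate it explicitly.

class DistillBatchItem(BaseModel):
    prompt: str
    response: str
    system_prompt: str = ""  # default preserves old feedback logs

old_record = {"prompt": "hi", "response": "hello"}  # pre-migration shape
item = DistillBatchItem.model_validate(old_record)  # validates without error
```

The trade-off is that the "required, no default" guarantee then lives in the API layer (the 422 on missing system prompts) rather than the model itself; an explicit migration of old log files would be the stricter alternative.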
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Ensures OpenClaw and API logs are captured on failure, since the cleanup trap was destroying containers before the CI log-dump step could read them. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Latest openclaw requires gateway.controlUi.allowedOrigins when binding to non-loopback. We don't need the Control UI in Docker, so disable it to prevent gateway startup failure. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
- Set `add_generation_prompt=True` in both Tinker (`_build_sample_datum`) and local/Modal (`_build_self_teacher_topk`) teacher prompt construction — the `<|im_start|>assistant\n` turn delimiter was missing after the refactor to pre-tokenized data flow, causing strongly negative advantages for early response tokens
- Emit `teacher_scored_texts` in distill metadata across all three backends; render per-sample teacher scored text in the feedback dashboard (old logs without the field are handled gracefully)
- Add tests: assert `add_generation_prompt=True`, verify `teacher_scored_texts` in metadata for single/batch flows, and test dashboard rendering with and without the new field

Test plan
- `uv run ruff check claas/ tests/ --fix` — all checks passed
- `uv run ty check` — all checks passed
- `uv run pytest tests/ -v -m "not integration"` — 160 passed, 5 deselected

🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Bug Fixes
Documentation